By the end of this lesson, you will be able to:
Data visualization is the representation of data through the use of common graphics, such as charts, plots, infographics, and even animations. These visual displays communicate complex data relationships and data-driven insights in a way that is easy to understand. This technique is mainly used for
For more details see [https://r-coder.com/plot-r/]
Syntax
plot(x, y, ...)
- the following arguments are optional
for a dot plot: type = 'p' (default)
for a line chart: type = 'l'
to assign a plot title: main = "title", a character string
xlab = "Name of x variable", a character string
ylab = "Name of y variable", a character string
xlim = limits of the x values, a numeric range
ylim = limits of the y values, a numeric range
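As a minimal sketch of how these arguments fit together, the following draws a line chart with a title, axis labels, and axis limits. It uses the built-in pressure data set, which is an assumption made for illustration and is not used elsewhere in this lesson.

```r
# Built-in data set: vapor pressure of mercury as a function of temperature
x <- pressure$temperature
y <- pressure$pressure

plot(x, y,
     type = "l",                          # line chart instead of points
     main = "Vapor pressure of mercury",  # plot title
     xlab = "Temperature (deg C)",        # x-axis label
     ylab = "Pressure (mm Hg)",           # y-axis label
     xlim = range(x),                     # x-axis limits
     ylim = range(y))                     # y-axis limits
```

Changing type to 'p' would draw the same data as points.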
A scatter chart (or a scatter plot) is a chart that shows the relationship between two quantitative variables.
## 'data.frame': 150 obs. of 5 variables:
## $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
## $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
## $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
## $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
## $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
x = iris$Sepal.Length
y= iris$Sepal.Width
plot(x, y, type = 'p', xlim = range(x), ylim = range(y),
xlab = "Sepal.Length",
ylab = "Sepal.Width",
main = "Association of Sepal.Length and Sepal.Width of iris data")
For more details on colors, see [http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf]
For more details on point symbols (pch), see [https://www.r-bloggers.com/2021/06/r-plot-pch-symbols-different-point-shapes-in-r/]
x = iris$Sepal.Length
y= iris$Sepal.Width
plot(x, y, col = iris$Species, pch = 10,
main = "Color by Species")
x = iris$Sepal.Length
y= iris$Sepal.Width
y1 = iris$Petal.Width
plot(x, y, ylim = range(c(y, y1)), col = iris$Species, pch = 10)
points(x, y1, col = iris$Species, pch = 20)
In the previous example we observed how Sepal.Width and Petal.Width are associated with the x-variable Sepal.Length. Now let’s observe how those y-variables are associated with two x-variables, Sepal.Length and Petal.Length.
In R, multiple plots can be combined into one overall figure with par(mfrow = c(i, j)).
par(mfrow = c(i, j)): combines the plots
i indicates number of rows
j indicates number of columns
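Because par() changes the layout for every later plot on the same device, it can help to restore the previous layout afterwards. The sketch below relies on par() returning the old values of the parameters it changes; the names old_par, layout_during, and layout_after are hypothetical and used only for illustration.

```r
# par() returns the previous values of the parameters it changes
old_par <- par(mfrow = c(1, 2))   # 1 row, 2 columns

plot(iris$Sepal.Length, iris$Sepal.Width,
     xlab = "Sepal.Length", ylab = "Sepal.Width")
plot(iris$Petal.Length, iris$Petal.Width,
     xlab = "Petal.Length", ylab = "Petal.Width")

layout_during <- par("mfrow")     # layout while the two panels are active
par(old_par)                      # restore the previous layout
layout_after <- par("mfrow")
```

After par(old_par), the next plot fills the whole device again.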
par(mfrow = c(2, 2))
#plot 1
x1 = iris$Sepal.Length
y1= iris$Sepal.Width
y2 = iris$Petal.Length
y3 = iris$Petal.Width
plot(x1, y1, xlab = "Sepal.Length", ylab = "Sepal.Width", col = 'red', pch = 19)
plot(x1, y2, xlab = "Sepal.Length", ylab = "Petal.Length", col = 'green', pch = 20)
plot(x1, y3, xlab = "Sepal.Length", ylab = "Petal.Width", col = 'black', pch = 21)
Note that in RStudio the graph is displayed in the Plots pane, and the figure size can be adjusted in an R Markdown chunk by setting fig.width and fig.height.
par(mfrow = c(1, 2))
#plot 1
x1 = iris$Sepal.Length
y1= iris$Sepal.Width
y2 = iris$Petal.Width
plot(x1, y1, ylim = range(c(y1, y2)), col = 'red', pch = 18)
points(x1, y2, col = 'blue', pch = 20)
#plot 2
x2= iris$Petal.Length
y3 = iris$Sepal.Width
y4 = iris$Petal.Width
plot(x2, y3, ylim = range(c(y3, y4)), col = 'red', pch = 18)
points(x2, y4, col = 'blue', pch = 20)
We can also change the parameters mai, mar, and tcl. Type help(par) in the R console for more details.
mai: A numerical vector of the form c(bottom, left, top, right) which gives the margin size specified in inches.
mar: A numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the plot. The default is c(5, 4, 4, 2) + 0.1.
For more details see [https://datavoreconsulting.com/post/spacing-of-panel-figures-in-r/]
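The two margin parameters describe the same margins in different units, so setting one changes the value reported for the other. The following sketch illustrates this; the variable names are hypothetical, and the exact inch values depend on the graphics device, so none are shown.

```r
old_par <- par(no.readonly = TRUE)   # save the current settings

par(mar = c(4, 4, 2, 1))             # margins in lines of text
mar_lines  <- par("mar")
mai_inches <- par("mai")             # the same margins, reported in inches

par(mai = c(1, 1, 1, 1))             # margins set directly in inches
mar_after <- par("mar")              # mar now reflects the new inch values

par(old_par)                         # restore the saved settings
```

Using mar is usually more convenient when margins should scale with the text size, while mai fixes them in absolute units.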
par(mfrow = c(2, 2), tcl=-0.01, mai=c(0.5,0.5,0.5,0.5))
#plot 1
x = iris$Sepal.Length
y1= iris$Sepal.Width
y2 = iris$Petal.Width
plot(x, y1, ylim = range(y1), xlab = "Sepal.Length",
ylab = "Sepal.Width", col = "black", pch = 18)
#plot2
plot(x, y2, ylim = range(y2), xlab = "Sepal.Length",
ylab = "Petal.Width", col = 'blue', pch = 18)
#plot 3 and 4
x1 = iris$Petal.Length
y3= iris$Sepal.Width
y4 = iris$Petal.Width
plot(x1, y3, ylim = range(y3), xlab = "Petal.Length",
ylab = "Sepal.Width", col = "red", pch = 18)
#plot4
plot(x1, y4, ylim = range(y4), xlab = "Petal.Length",
ylab = "Petal.Width", col = 'green', pch = 18)
x1 = iris$Sepal.Length
x2 = iris$Petal.Length
idx = 1: length(x1)
plot(idx, x1, type = "l", xlab = "", ylab = "", col = 'red', lty = 1,
main = "Sepal.Length vs Petal.Length comparison")
lines(idx, x2, lty = 2, col = 'blue') # xlab/ylab are not accepted by lines()
x1 = iris$Sepal.Length
x2 = iris$Petal.Length
idx = 1: length(x1)
plot(idx, x1, type = "l", xlab = "", ylab = "", ylim = range(x1, x2),
lty = 1, col = 'red', main = "Sepal.Length vs Petal.Length comparison")
lines(idx, x2, lty = 2, col = 'blue')
When comparing multiple variables in a trace plot or scatter plot, it is very hard to tell which line or set of points belongs to which variable. Adding a legend is therefore important in such cases.
For more details see [https://r-coder.com/add-legend-r/]
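The same idea applies to scatter plots. As a hedged sketch, the following adds a legend to a scatter plot colored by species. It assumes the default color palette, where a factor passed to col is mapped to colors 1, 2, 3 in the order of its levels.

```r
# A factor passed to col is converted to its integer codes,
# so the three species get colors 1, 2, 3 of the current palette
species <- iris$Species

plot(iris$Sepal.Length, iris$Sepal.Width,
     col = species, pch = 19,
     xlab = "Sepal.Length", ylab = "Sepal.Width",
     main = "Sepal measurements by species")

legend("topright",
       legend = levels(species),          # one entry per species
       col = seq_along(levels(species)),  # same palette indices as the points
       pch = 19,
       title = "Species")
```

Using levels() and seq_along() keeps the legend entries in the same order as the colors assigned to the points.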
x1 = iris$Sepal.Length
x2 = iris$Petal.Length
idx = 1: length(x1)
plot(idx, x1, type = "l", xlab = "", ylab = "", ylim = range(x1, x2),
lty = 1, col = 'red', main = "Sepal.Length vs Petal.Length comparison")
lines(idx, x2, lty = 2, col = 'blue')
legend(x = "topleft", # Position
legend = c("Sepal.Length", "Petal.Length"), # Legend texts
lty = c(1, 2), # Line types
col = c('red', 'blue'), # Line colors
lwd = 2) # Line width
# Make the window wider than taller
x1 = iris$Sepal.Length
x2 = iris$Petal.Length
idx = 1: length(x1)
plt = function() {
  plot(idx, x1, type = "l", xlab = "", ylab = "", ylim = range(x1, x2),
       lty = 1, col = 'red', main = "Sepal.Length vs Petal.Length comparison")
  lines(idx, x2, lty = 2, col = 'blue')
}
# Save current graphical parameters
opar <- par(no.readonly = TRUE)
# Change the margins of the plot (the fourth is the right margin)
par(mar = c(5, 5, 5, 11))
plt()
legend(x = "topright",
inset = c(-.2, 0), # You will need to fine-tune the first
# value depending on the window size
legend = c("Sepal.Length", "Petal.Length"), # Legend texts
lty = c(1, 2),
col = c('red', 'blue'), # Line colors
lwd = 2,
xpd = TRUE) # Allows the legend to be drawn outside the plot region
# Restore the saved graphical parameters
par(opar)
A bar plot is a chart that presents categorical data with rectangular bars whose heights or lengths are proportional to the values (or counts) they represent. The bars can be plotted vertically or horizontally.
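The bar plot code that follows uses an object named car_counts_by_cyl whose definition is not shown. A minimal sketch, assuming it holds the number of cars for each cylinder count in the built-in mtcars data (which agrees with the printed counts), is:

```r
# Assumption: counts of cars by number of cylinders in mtcars
car_counts_by_cyl <- table(mtcars$cyl)
car_counts_by_cyl
```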
##
## 4 6 8
## 11 7 14
# One row, two columns
par(mfrow = c(1, 2))
# Absolute frequency barplot
barplot(car_counts_by_cyl, main = "Absolute frequency",
col = rainbow(3))
# Relative frequency barplot
barplot(prop.table(car_counts_by_cyl) * 100, main = "Relative frequency (%)",
col = rainbow(3))
Boston311_2023_data = read.csv("https://data.boston.gov/dataset/8048697b-ad64-4bfc-b090-ee00169f2323/resource/e6013a93-1321-4f2a-bf91-8d8a02f1e62f/download/tmp518q5snq.csv")
library(stringr)
library(dplyr)
# Flag cases whose title mentions "Parking Enforcement"
Boston311_2023_data$Parking_Enforcement_status <- str_detect(Boston311_2023_data$case_title, regex("\\bParking Enforcement\\b"))
Parking_Enforcement_by_nbd <- Boston311_2023_data %>%
filter(Parking_Enforcement_status) %>% # keep only the flagged cases before counting
group_by(neighborhood) %>%
summarise(nbd_count_Parking_Enforcement = n()) %>%
arrange(desc(nbd_count_Parking_Enforcement))
head(Parking_Enforcement_by_nbd, 10)
top_10_nbd = Parking_Enforcement_by_nbd[1:10, ]
barplot(names.arg = top_10_nbd$neighborhood, height = top_10_nbd$nbd_count_Parking_Enforcement,
col = rainbow(10), las = 2)
par(mar, mgp, las)
par(mar=c(5.1, 4.1, 4.1, 2.1), mgp=c(3, 1, 0), las=0)
par sets or adjusts plotting parameters. Here we consider the following three parameters: margin size (mar), axis label locations (mgp), and axis label orientation (las).
mar – A numeric vector of length 4, which sets the margin sizes in the following order: bottom, left, top, and right. The default is c(5.1, 4.1, 4.1, 2.1).
mgp – A numeric vector of length 3, which sets the axis label locations relative to the edge of the inner plot window. The first value represents the location of the axis labels (i.e. xlab and ylab in plot), the second the tick-mark labels, and the third the tick marks. The default is c(3, 1, 0).
las – A numeric value indicating the orientation of the tick mark labels and any other text added to a plot after its initialization. The options are as follows: always parallel to the axis (the default, 0), always horizontal (1), always perpendicular to the axis (2), and always vertical (3).
par(mar = c(4, 16, 2, 2))
top_10_nbd = Parking_Enforcement_by_nbd[1:10, ]
barplot(names.arg = top_10_nbd$neighborhood, height = top_10_nbd$nbd_count_Parking_Enforcement,
col = rainbow(10), horiz = TRUE, las = 1)
var1 = iris$Sepal.Length
cut_off = c(0, 5, 6, 7 , 8)
catgory = c("low", "low_mid", "high_mid", "high")
Sepal_Len_cat1 = cut(var1, breaks = cut_off, labels = catgory)
iris_new = cbind(iris, Sepal_Len_cat1)
barplot(table(iris_new$Sepal_Len_cat1), col = rainbow(4), legend.text = levels(iris_new$Sepal_Len_cat1)) # With legend
# Variable am to factor
am = mtcars$am
am <- factor(am)
# Change factor levels
levels(am) <- c("Automatic", "Manual")
summary_data <- tapply(mtcars$hp, list(cylinders = mtcars$cyl,
transmission = am),FUN = mean, na.rm = TRUE)
summary_data
## transmission
## cylinders Automatic Manual
## 4 84.66667 81.8750
## 6 115.25000 131.6667
## 8 194.16667 299.5000
par(mar = c(5, 5, 4, 10))
barplot(summary_data, xlab = "Transmission type",
main = "Horsepower mean",
col = rainbow(3),
beside = TRUE,
legend.text = rownames(summary_data),
args.legend = list(title = "Cylinders", x = "topright",
inset = c(-0.20, 0)))
par(mar = c(5, 5, 4, 10), las = 0)
barplot(summary_data,
main = "Horsepower mean",
xlab = "Transmission type", ylab = "HP mean",
col = c('red', 'blue', 'green'),
legend.text = rownames(summary_data),
beside = FALSE, # Stacked bars (default)
args.legend = list(title = "Cylinders", x = "topright",
inset = c(-0.3, 0)))
A pie chart represents data as numerical proportions of a whole. In R, a pie chart is created with the pie() function.
# cyl-wise distribution of data using pie-chart
count_cars <- mtcars %>%
group_by(cyl) %>%
summarise(count = n())
car_type <- paste(count_cars$cyl, "cyl")
count <- count_cars$count
# calculating percentage participation
perc <- round(count/sum(count)* 100, 2)
# combine car type and percentage to create the labels
labels <- paste(car_type, perc, '%')
pie(count, labels = labels, radius = 1, col = hcl.colors(n = 3, palette = 'ag_Sunset'), border = 'gray', main = "Pie chart in R")
# Horizontal line: plot() with type = 'n' sets up an empty plotting
# region (no points are drawn), then abline() adds the line
plot(1, type = 'n', xlab = "",
ylab = "", xlim = c(0, 5),
ylim = c(0, 5))
abline(h = 2, col = 'red')
# Vertical line: plot() with type = "n" sets up an empty plotting
# region (no points are drawn), then abline() adds the line
plot(1, type = "n", xlab = "",
ylab = "", xlim = c(0, 5),
ylim = c(0, 5))
abline(v = 2, col = 'red')
# Horizontal and vertical lines together: plot() with type = "n" sets up
# an empty plotting region, then abline() adds both lines
plot(1, type = "n", xlab = "",
ylab = "", xlim = c(0, 5),
ylim = c(0, 5))
abline(h = 2.5, v = 2, col = 'red')
# Lines given by intercept (a) and slope (b): plot() with type = "n" sets up
# an empty plotting region, then abline() adds each line
plot(1, type = "n", xlab = "",
ylab = "", xlim = c(0, 5),
ylim = c(0, 5))
abline(a = 0, # Intercept
b = 1, col = 'red') # Slope
abline(a = 5, # Intercept
b = -1, col = 'blue') # Slope
The histogram is the most widely used graph for representing quantitative (numerical) data, mostly data that are continuous in nature.
Syntax
hist(x,....)
hist(x, breaks = "Sturges",
freq = NULL, probability = !freq,
include.lowest = TRUE, right = TRUE,
density = NULL, angle = 45, col = NULL, border = NULL,
main = paste("Histogram of" , xname),
xlim = range(breaks), ylim = NULL,
xlab = xname, ylab,
axes = TRUE, plot = TRUE, labels = FALSE,
nclass = NULL, warn.unused = TRUE, …)
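To see what the breaks and probability arguments do, the following sketch bins Sepal.Length with explicit break points and inspects the result before drawing it. The break points (width 0.5 from 4 to 8) and the names brks and h are choices made here for illustration.

```r
x <- iris$Sepal.Length

# Explicit break points: bins of width 0.5 covering the whole data range
brks <- seq(4, 8, by = 0.5)

# plot = FALSE returns the binning without drawing anything
h <- hist(x, breaks = brks, plot = FALSE)
h$counts    # number of observations in each bin
h$density   # bar heights used when probability = TRUE

# Draw the histogram on the density scale with the same breaks
hist(x, breaks = brks, probability = TRUE, col = "gray",
     xlab = "Sepal.Length", main = "Sepal.Length with explicit breaks")
```

On the density scale the bar areas sum to 1, which is what allows a density curve to be overlaid on the same axes.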
hist(iris$Sepal.Length, breaks = 15, xlab = 'Sepal.Length', ylab = 'Density', probability = TRUE, col = 'gray', main = "Histogram of Sepal.Length of Iris data") # probability = TRUE puts density on the y-axis
par(mfrow = c(2, 2))
x <- iris$Sepal.Length # First group
y <- iris$Petal.Length # Second group
hist(x, main = "Histogram of Sepal.Length")
hist(y, main = "Histogram of Petal.Length")
# Combine plot
hist(x, xlim = c(0, 8),ylim = c(0, 50), main = "Histogram of Two variables")
hist(y, add = TRUE, col = rgb(1, 0, 0, alpha = 1))
par(mfrow = c(1, 2))
x <- iris$Sepal.Length # First group
y <- iris$Petal.Length # Second group
hist(x, probability = TRUE, main = "Histogram of Sepal.Length")
lines(density(x), lwd = 2, col = 'red')
hist(y, probability = TRUE, main = "Histogram of Petal.Length")
lines(density(y), lwd = 2, col = 'red')
x <- iris$Sepal.Length # First group
y <- iris$Petal.Length # Second group
hist(x, ylim = c(0, 0.5), probability = TRUE,
main = "Histogram of Sepal.Length")
x_val = seq(min(x), max(x), length.out = 100)
f_val = dnorm(x_val, mean = mean(x), sd = sd(x))
lines(x_val, f_val, lwd = 2, col = 'red')
Box plots (Chambers 1983) are an excellent tool for detecting and illustrating location and variation changes between different groups of data.
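To illustrate the comparison between groups, here is a minimal sketch that draws one box per species using the formula interface of boxplot(). The colors and the name b are arbitrary choices for illustration.

```r
# One box per species, using the formula interface: response ~ group
b <- boxplot(Sepal.Length ~ Species, data = iris,
             xlab = "Species", ylab = "Sepal.Length",
             col = c("tomato", "steelblue", "seagreen"),
             main = "Sepal.Length by species")

# boxplot() invisibly returns the statistics it drew
b$names   # group labels
b$stats   # rows: lower whisker, lower hinge, median, upper hinge, upper whisker
```

Differences in the position of the boxes indicate location shifts between groups, and differences in box height indicate changes in spread.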
boxplot(x, xlab = "Sepal.Length", horizontal = TRUE)
stripchart(x, method = "jitter", pch = 19, add = TRUE, col = "red")
IQR = Q3 - Q1
Usual low value, L = Q1 - 1.5*IQR
Usual high value, U = Q3 + 1.5*IQR
Any value outside the range between L and U is considered an outlier.
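As a small worked example of this rule, the sketch below applies it to a hypothetical sample of eight values (made up for illustration) in which one value is clearly larger than the rest.

```r
# Small hypothetical sample with one unusually large value
x_small <- c(10, 12, 13, 14, 15, 16, 18, 40)

q1 <- quantile(x_small, probs = 0.25, names = FALSE)
q3 <- quantile(x_small, probs = 0.75, names = FALSE)
iqr <- q3 - q1

lower <- q1 - 1.5 * iqr   # L
upper <- q3 + 1.5 * iqr   # U

# Values outside [L, U] are flagged as outliers
outliers <- x_small[x_small < lower | x_small > upper]
outliers
```

Here Q1 = 12.75 and Q3 = 16.5, so IQR = 3.75, L = 7.125, and U = 22.125, and only the value 40 is flagged. Note that boxplot() bases its whiskers on the hinges from fivenum(), which can differ slightly from quantile() in small samples.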
x <- rnorm(50, 20, 5)
x1 <- c(-4, -7, 0, 50, 55) # add few extreme data points
x <- c(x, x1)
boxplot(x)
Q1 <- quantile(x, probs = 0.25)
Q3 <- quantile(x, probs = 0.75)
IQR <- Q3 - Q1
L <- Q1 - 1.5*(IQR)
U <- Q3 + 1.5*(IQR)
boxplot(x, horizontal = TRUE, main = "Detection of outliers using a boxplot")
abline(v = L, col = 'red')
abline(v = U, col = 'blue')
x = iris$Sepal.Length
y = iris$Petal.Length
plot(x, y, pch = 19, col = "gray52")
# Linear fit
abline(lm(y ~ x), col = "orange", lwd = 3)
# Smooth fit
lines(lowess(x, y), col = "blue", lwd = 3)
# Legend
legend("topleft", legend = c("Linear", "Smooth"),
lwd = 3, lty = c(1, 1), col = c("orange", "blue"))
A scatter plot matrix is a set of pairwise scatter plots that shows the association between each pair of variables.
#numerical_df <- subset(iris, select = c(Sepal.Length, Sepal.Width,Petal.Length,Petal.Width))
#pairs(numerical_df)
pairs(~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = iris)
ggplot2 is one of the most widely used packages for data visualization in R, and it builds plots in layers.
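To make the idea of layers concrete, here is a minimal sketch that builds a plot one step at a time with the + operator. The object names p0 to p3 are hypothetical, and the trend line added with geom_smooth() is an illustration chosen here rather than something taken from this lesson.

```r
library(ggplot2)

# Start: data and aesthetic mappings only (draws an empty panel)
p0 <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width))

# Layer 1: add points
p1 <- p0 + geom_point()

# Layer 2: add a linear trend line on top of the points
p2 <- p1 + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Labels are added in the same way
p3 <- p2 + labs(title = "Built up layer by layer")
p3
```

Each intermediate object is itself a complete plot, so printing p0, p1, or p2 shows the figure at that stage.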
# install.packages("ggplot2")
library(dplyr)
library(ggplot2)
iris %>%
ggplot() +
aes(x = Sepal.Length, y = Sepal.Width) +
geom_point(size = 2, shape = 10)
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point(aes(colour = Species)) + # Points and color by group
scale_color_discrete("Type") + # Change legend title
xlab("Sepal.Length") + # X-axis label
ylab("Sepal.Width") + # Y-axis label
theme(axis.line = element_line(colour = "black", # Changes the default theme
linewidth = 0.24)) # linewidth replaces the size argument, deprecated as of ggplot2 3.4.0
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
ggtitle("Scatter plot in R") +
theme(plot.title = element_text(hjust=0.5)) + # Assign title on center
geom_point(aes(color = Species)) + # Points and color by group
#scale_color_discrete("type") + # Change legend title
xlab("Sepal.Length") + # X-axis label
ylab("Sepal.Width") + # Y-axis label
theme(axis.line = element_line(colour = "red", linewidth = 0.5)) # Changes the default theme (xy-axes)
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
geom_abline(intercept = 3, slope = 0 ) +
ggtitle("Scatter plot in R") +
theme(plot.title = element_text(hjust=0.5)) + # Assign title on center
geom_point(aes(colour = Species)) + # Points and color by group
scale_color_discrete("Species") + # Change legend title
xlab("Sepal.Length") + # X-axis label
ylab("Sepal.Width") + # Y-axis label
theme(axis.line = element_line(colour = "black", # Changes the default theme
linewidth = 0.01))
- Remove the grids
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
ggtitle("Scatter plot in R") +
theme(plot.title = element_text(hjust=0.5)) + # Assign title on center
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
#theme_void() + #remove background
#theme_classic() +#remove background
geom_point(aes(colour = Species)) + # Points and color by group
#scale_color_discrete("Species") + # Change legend title
xlab("Sepal.Length") + # X-axis label
ylab("Sepal.Width") + # Y-axis label
theme(axis.line = element_line(colour = "black", # Changes the default theme
linewidth = 0.5))
# Change the line type
ggplot(data=iris, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_line(linetype = "dashed")
# Change the line type
ggplot(data=iris, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_line(linetype = "solid")+
geom_point()
# Change the line type
ggplot(data=iris, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_line(aes(colour = Species))+
geom_point(aes(colour = Species)) + # Points and color by group
#scale_color_discrete("Species") + # Change legend title
xlab("Sepal.Length") + # X-axis label
ylab("Sepal.Width") # Y-axis label
df = mtcars %>%
group_by(cyl)%>%
summarise(count = n())
# Basic barplot
ggp <- ggplot(data=df, aes(x = cyl, y = count, fill = factor(cyl))) +
geom_bar(stat="identity", width=1) +
theme_minimal()
ggp
# Don't map a variable to y
ggp <- ggplot(mtcars, aes(x=factor(cyl), fill = factor(cyl)))+
geom_bar() +
theme_minimal()
ggp
top_10_nbd = Parking_Enforcement_by_nbd[1:10, ]
ggp <- ggplot(top_10_nbd, aes(y = neighborhood, x = nbd_count_Parking_Enforcement, fill = neighborhood)) +
geom_bar(stat = "identity") +
scale_fill_discrete(name = "neighborhood") + # Legend title; fill (not colour) is the mapped aesthetic
xlab("Parking enforcement count by Neighborhood") + # X-axis label
ylab("Neighborhood") + # Y-axis label
theme_minimal()
ggp
top_10_nbd = Parking_Enforcement_by_nbd[1:10, ]
ggp <- ggplot(top_10_nbd, aes(y = reorder(neighborhood, nbd_count_Parking_Enforcement), x = nbd_count_Parking_Enforcement, fill = neighborhood)) +
geom_bar(stat = "identity") +
scale_fill_discrete(name = "neighborhood") + # Legend title; fill (not colour) is the mapped aesthetic
theme_void()
ggp
df = mtcars %>%
group_by(cyl)%>%
summarise(count = n())
df$cyl = as.factor(df$cyl)
# Basic barplot
ggp <- ggplot(data=df, aes(x ='', y = count, fill = cyl)) +
geom_bar(stat="identity", width=0.7) +
theme_minimal()
ggp
ggp <- ggplot(data=df, aes(x = '', y = count, fill = cyl)) +
geom_bar(stat="identity", width=0.7) +
coord_polar("y", start=0)
ggp
df$perc = round(df$count/sum(df$count),4) *100
ggp <- ggplot(data=df, aes(x = '', y = perc, fill = cyl)) +
geom_col() +
geom_text(aes(label = paste(perc, '%')), color = rep("white", 3),
position = position_stack(vjust = 0.5)) +
coord_polar(theta = "y") +
theme_void()
ggp
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Change colors
p <-ggplot(iris, aes(x=Sepal.Length)) +
geom_histogram(color="black", fill="gray")+
theme_void()
p
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Add mean line
p + geom_vline(aes(xintercept = mean(Sepal.Length)),
color = "blue", linetype = "dashed", linewidth = 1) # linewidth replaces size for lines as of ggplot2 3.4.0
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Histogram with density plot
ggplot(iris, aes(Sepal.Length)) +
geom_histogram(aes(y = after_stat(density)), colour = "black", fill = "white") + # after_stat() replaces ..density.. as of ggplot2 3.4.0
geom_density(alpha = .1, fill = "red") + # transparency parameter
theme_minimal()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Change histogram plot line colors by groups
ggplot(iris, aes(x=Sepal.Length, color=Species)) +
geom_histogram(fill="gray")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Change histogram plot line colors by groups
ggplot(iris, aes(x=Sepal.Length,fill=Species, color=Species)) +
geom_histogram(position="identity", alpha=0.5)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
stat_summary <- iris %>%
group_by(Species) %>%
summarise(mean_SepL = mean(Sepal.Length), median_SepL = median(Sepal.Length))
stat_summary <- data.frame(Species = rep(stat_summary$Species, 2), stat = c(stat_summary$mean_SepL, stat_summary$median_SepL), value = rep(c('mean', 'median'), each = 3))
p <- ggplot(iris, aes(x=Sepal.Length))+
geom_histogram(color="black", fill="steelblue")+
facet_grid(Species ~ .) +
geom_vline(data = stat_summary, mapping = aes(xintercept = stat, color = value))
p
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Change outlier, color, shape and size
ggplot(iris, aes(x=Sepal.Length, y=Species)) +
geom_boxplot(outlier.colour="red", outlier.shape=8,
outlier.size=4)